Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems

Authors

  • Tommi S. Jaakkola
  • Satinder P. Singh
  • Michael I. Jordan
Abstract

Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due to successes in the theoretical analysis of their behavior in Markov environments. If the Markov assumption is removed, however, neither the algorithms nor the analyses continue to be usable in general. We propose and analyze a new learning algorithm to solve a certain class of non-Markov decision problems. Our algorithm applies to problems in which the environment is Markov, but the learner has restricted access to state information. The algorithm involves a Monte Carlo policy evaluation combined with a policy improvement method that is similar to that of Markov decision problems and is guaranteed to converge to a local maximum. The algorithm operates in the space of stochastic policies, a space which can yield a policy that performs considerably better than any deterministic policy. Although the space of stochastic policies is continuous even for a discrete action space, our algorithm is computationally tractable.
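The abstract's claim that a stochastic policy can beat every deterministic one is easy to illustrate with a toy problem. The sketch below is a hypothetical construction in the spirit of the paper, not its own example: a two-state Markov environment whose two hidden states emit the same observation, so a memoryless policy cannot tell them apart. A Monte Carlo rollout estimates the long-run average reward, mirroring the Monte Carlo policy-evaluation step described in the abstract.

```python
import random

# Hypothetical aliased POMDP: in hidden state s, the action a == s
# "switches" to the other state and pays +1; the other action keeps
# the state and pays -1.  Both states look identical to the agent.

def step(state, action):
    """One transition of the hidden-state Markov environment."""
    if action == state:                  # the switch action for this state
        return 1 - state, +1.0
    return state, -1.0                   # wrong guess: stay put, pay -1

def mc_average_reward(policy, horizon=1000, episodes=50, seed=0):
    """Monte Carlo estimate of the average reward per step under a policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        state, ret = rng.choice([0, 1]), 0.0
        for _ in range(horizon):
            state, r = step(state, policy(rng))
            ret += r
        total += ret / horizon
    return total / episodes

# Any deterministic memoryless policy eventually reaches the state where
# its fixed action does not switch, and then pays -1 forever.
deterministic = lambda rng: 0
# The uniform stochastic policy switches half the time on average.
stochastic = lambda rng: rng.randrange(2)

print(mc_average_reward(deterministic))   # close to -1
print(mc_average_reward(stochastic))      # close to 0
```

The deterministic policy's average reward tends to -1, while the uniform stochastic policy achieves roughly 0, so no deterministic memoryless policy matches the stochastic one on this problem.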

Similar articles

Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes

Satinder Singh, AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932, [email protected] Partially Observable Markov Decision Processes (POMDPs) constitute an important class of reinforcement learning problems which present unique theoretical and computational difficulties. In the absence of the Markov property, popular reinforcement learning algorithms such as Q-learning may no long...


Reinforcement Learning with Partially Known World Dynamics

Reinforcement learning would enjoy better success on real-world problems if domain knowledge could be imparted to the algorithm by the modelers. Most problems have both hidden state and unknown dynamics. Partially observable Markov decision processes (POMDPs) allow for the modeling of both. Unfortunately, they do not provide a natural framework in which to specify knowledge about the domain dyn...


Reinforcement Learning for Problems with Hidden State

In this paper, we describe how techniques from reinforcement learning might be used to approach the problem of acting under uncertainty. We start by introducing the theory of partially observable Markov decision processes (POMDPs) to describe what we call hidden state problems. After a brief review of other POMDP solution techniques, we motivate reinforcement learning by considering an agent wi...


The Infinite Regionalized Policy Representation

We introduce the infinite regionalized policy representation (iRPR), as a nonparametric policy for reinforcement learning in partially observable Markov decision processes (POMDPs). The iRPR assumes an unbounded set of decision states a priori, and infers the number of states to represent the policy given the experiences. We propose algorithms for learning the number of decision states while main...


A Reinforcement Learning System Based on State Space Construction Using Fuzzy ART

A new reinforcement learning system using fuzzy ART (adaptive resonance theory) is proposed. In the proposed method, fuzzy ART is used to classify observed information and to construct an effective state space. Then, profit sharing is employed as the reinforcement learning method. Furthermore, the proposed system is extended to a hierarchical structure for solving partially observable Markov deci...
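The core of the fuzzy ART step described above is unsupervised categorization of observations: each input either resonates with an existing category (passing a vigilance test) or commits a new one, and the resulting category index can serve as a discrete state. A minimal sketch, assuming the standard fuzzy ART equations with hypothetical parameter names (`rho` = vigilance, `alpha` = choice, `beta` = learning rate); this is not the cited paper's implementation:

```python
import numpy as np

class FuzzyART:
    """Minimal fuzzy ART categorizer for inputs in [0, 1]^d (a sketch)."""

    def __init__(self, d, rho=0.75, alpha=0.001, beta=1.0):
        self.d, self.rho, self.alpha, self.beta = d, rho, alpha, beta
        self.w = []                              # one weight vector per category

    def _code(self, x):
        x = np.asarray(x, dtype=float)
        return np.concatenate([x, 1.0 - x])      # complement coding

    def classify(self, x, learn=True):
        i = self._code(x)
        # Category choice: T_j = |i ^ w_j| / (alpha + |w_j|), largest first.
        order = sorted(range(len(self.w)),
                       key=lambda j: -np.minimum(i, self.w[j]).sum()
                                      / (self.alpha + self.w[j].sum()))
        for j in order:
            match = np.minimum(i, self.w[j]).sum() / i.sum()
            if match >= self.rho:                # vigilance test: resonance
                if learn:                        # fast learning when beta = 1
                    self.w[j] = (self.beta * np.minimum(i, self.w[j])
                                 + (1.0 - self.beta) * self.w[j])
                return j
        self.w.append(i.copy())                  # commit a new category
        return len(self.w) - 1
```

With a high vigilance, nearby observations share a category while distant ones get separate ones, which is what makes the category index usable as a constructed state for the reinforcement learner.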



Journal:

Volume   Issue

Pages  -

Publication date: 1994